A Multi-Model-Approach to Improve Final Results
Abstract
In this paper we apply multidimensional decompositions to improve modelling results. Results generated by a model usually include both wanted and destructive components. When several models are available, some of these components can be common to all of them. Our aim is to find the basis elements and to distinguish the components with a positive influence on modelling quality from the negative ones. After rejecting the negative elements from the models' results, we obtain better results in terms of some standard error criteria. The basis components can be identified by ICA and PCA transformations. The procedure of model decomposition and improvement is implemented in SAS/IML. The models' errors are analysed in SAS BASE and the graphs are generated in SAS/GRAPH. The automation is performed using the SAS MACRO language. The paper is addressed to an audience with a data mining and statistical background.

INTRODUCTION

Data Mining (DM) is the process of finding trends and patterns in data [6,10]. Usually it aims at finding previously unknown knowledge that can be used for business purposes such as fraud detection, client/market segmentation, risk analysis, customer satisfaction, bankruptcy prediction, etc. The methodology of DM modelling can follow the SEMMA procedure introduced by SAS, which consists of five parts: Sample, Explore, Modify, Model and Assess [11]. Typically, in a data mining problem many models are tested and then, according to a particular criterion, the best one is chosen. The other models are left out. In this paper we propose to utilize the information they provide. The motivation for this methodology is that the notion of "the best model" is somewhat ambiguous: there are many different criteria, and they can indicate different models as the best one. On the other hand, even if the results of several models are not the best according to a specific criterion, it is still possible to utilize them to improve the final effect.
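The ambiguity of "the best model" can be illustrated with a small sketch. The data, the model names and the choice of mean squared error and mean absolute error as criteria are all assumptions made for illustration; the paper itself only refers to "standard error criteria".

```python
# Hypothetical example: two invented models scored with two common error criteria.

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [10.0, 10.0, 10.0, 10.0]
predictions = {
    "model_a": [11.0, 9.0, 11.0, 9.0],    # small error on every observation
    "model_b": [10.0, 10.0, 10.0, 12.5],  # exact except for one large miss
}

best_by_mse = min(predictions, key=lambda name: mse(actual, predictions[name]))
best_by_mae = min(predictions, key=lambda name: mae(actual, predictions[name]))

print(best_by_mse)  # model_a (MSE 1.0 vs 1.5625)
print(best_by_mae)  # model_b (MAE 0.625 vs 1.0)
```

Here each criterion selects a different model, so discarding the "losing" model under either criterion throws away information that the other criterion values.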
Usually, solutions of the model aggregation problem combine a few models by mixing their results or parameters [7,15]. Our aim is to integrate the knowledge uncovered by the set of models by decomposing the models' results into signals, rejecting the destructive ones, and then applying the inverse of the decomposition [13,14]. Such a transformation can be done by means of Independent Component Analysis (ICA) and Principal Component Analysis (PCA) [8,9]. The presented methods use many signals and different decompositions, and can utilize different criteria to find a more accurate final result. This leads to a multivariate analysis that can be performed with SAS/IML.

MODEL RESULTS' INTEGRATION

The models try to represent the dependency between the input data and the target, so they bring some knowledge about the real value [5]. We assume that each model's results include two types of components: positive ones associated with the target, and destructive ones associated with inaccurate learning data, individual properties of the models, etc. Many of the good and bad components are common to all the models because they share the same target, learning data set, similar model structures or optimization methods. Our aim is to explore the information given simultaneously by many models in order to identify and eliminate the components with a destructive impact on the model results. We assume that the result of the $i$-th model $x_i$, $i = 1, \dots, m$, with $N$ observations, is a linear combination of positive impact components $t_1, t_2, \dots, t_p$ and destructive components $v_1, v_2, \dots, v_q$, which gives

$$x_i = \alpha_{i1} t_1 + \dots + \alpha_{ip} t_p + \beta_{i1} v_1 + \dots + \beta_{iq} v_q. \qquad (1)$$

In compact matrix form we have

$$x_i = \boldsymbol{\alpha}_i \mathbf{t} + \boldsymbol{\beta}_i \mathbf{v}, \qquad (2)$$

where $\mathbf{t} = [t_1, t_2, \dots, t_p]^T$ is a $p \times N$ matrix of target components, $\mathbf{v} = [v_1, v_2, \dots, v_q]^T$ is a $q \times N$ matrix of residuals, and $\boldsymbol{\alpha}_i = [\alpha_{i1}, \dots, \alpha_{ip}]$, $\boldsymbol{\beta}_i = [\beta_{i1}, \dots, \beta_{iq}]$ are vectors of coefficients. In the case of many models, the same relation is written jointly for all of them as equation (3). As $x =$ ...

SUGI 30, Data Mining and Predictive Modeling
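The linear model of equations (1) and (2) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the component values, the coefficients and the helper names `combine` and `model_result` are invented here, and in practice the components are not known in advance but have to be estimated, for example by ICA or PCA.

```python
# Hypothetical toy data: p = 2 target components, q = 1 destructive component,
# N = 4 observations. All numbers are invented for illustration.
t = [[1.0, 2.0, 3.0, 4.0],
     [0.5, 0.0, -0.5, 0.0]]        # rows t_1, t_2 (a p x N matrix)
v = [[0.3, -0.2, 0.1, -0.4]]       # row v_1 (a q x N matrix)

def combine(coeffs, components):
    """Row vector times matrix: sum over k of coeffs[k] * components[k]."""
    n_obs = len(components[0])
    return [sum(c * comp[j] for c, comp in zip(coeffs, components))
            for j in range(n_obs)]

def model_result(alpha_i, beta_i, t, v):
    """Equation (2): x_i = alpha_i t + beta_i v."""
    positive_part = combine(alpha_i, t)
    destructive_part = combine(beta_i, v)
    return [g + b for g, b in zip(positive_part, destructive_part)]

alpha_1 = [1.0, 1.0]   # assumed coefficients of the positive components
beta_1 = [0.5]         # assumed coefficient of the destructive component

x_1 = model_result(alpha_1, beta_1, t, v)
# Rejecting the destructive component corresponds to setting its coefficient to zero.
x_1_clean = model_result(alpha_1, [0.0], t, v)

print(x_1_clean)  # [1.5, 2.0, 2.5, 4.0]
```

Setting an entry of `beta_1` to zero mirrors the rejection step described in the text: the improved result keeps only the contribution of the positive components.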
Similar resources
Model and Solution Approach for Multi objective-multi commodity Capacitated Arc Routing Problem with Fuzzy Demand
The capacitated arc routing problem (CARP) is one of the most important routing problems, with many applications in real-world situations. In some real applications, such as urban waste collection, decision makers have to consider more than one objective and investigate the problem under uncertain situations where required edges have demand for more than one type of commodity. So, in thi...
A hybrid solution approach for a multi-objective closed-loop logistics network under uncertainty
The design of closed-loop logistics (forward and reverse logistics) has attracted growing attention with the stringent pressures of customer expectations, environmental concerns and economic factors. This paper considers a multi-product, multi-period and multi-objective closed-loop logistics network model with regard to facility expansion as a facility location–allocation problem, which more cl...
Multi-period and Multi-objective Stock Selection Optimization Model Based on Fuzzy Interval Approach
The optimization of investment portfolios is the most important topic in financial decision making, and many relevant models can be found in the literature. Given the importance of portfolio optimization, this paper deals with novel solution approaches for a newly developed portfolio optimization model. Contrary to previous work, the uncertainty of future retur...
A Three-Echelon Multi-Objective Multi-Period Multi-Product Supply Chain Network Design Problem: A Goal Programming Approach
In this paper, a multi-objective multi-period multi-product supply chain network design problem is introduced. This problem is modeled as a multi-objective mixed-integer mathematical program. The objectives are maximizing the total profit of logistics, maximizing service level, and minimizing inconsistency of operations. Several sets of constraints are considered to handle the real situa...
Multi-objective Modeling Based on Competition Airlines Cooperation by Game Theory and Sustainable Development Approach
In each time period, the demand of passengers for each route is finite and airlines compete to earn more profit. The complex competition among airlines causes problems, such as complicating flight planning and increasing empty seats on some routes. These problems increase air pollution and fuel consumption. To solve these problems, this research studies the cooperation of the airlines wi...
A multi-product, multi-period model to select supplier for deteriorating products while considering uncertainty as well as backorder
Determining the supplier and the optimum order quantity is an issue of great importance in logistics management for many companies. In this regard, it is crucial to determine the best decisions for the order quantity as well as the most suitable supplier while considering existing limitations and uncertainties. To optimize a multi-product, multi-period model with supplier selection for deteriorat...
Publication date: 2001